Hip-hop music has been a major cultural force for several decades, with its roots in the African American and Latino communities of the Bronx in the 1970s. Over time, the genre has evolved and branched out into various sub-genres, including gangsta rap, conscious hip-hop and trap music, to name a few. In this course I’ve been researching the differences and similarities between popular ‘old-school hip-hop’ from the 90s and popular modern-day rap from around 2017 to 2023. Rap is now considered one of the most popular genres in contemporary music. I’m interested in how the music has changed and whether modern-day rap differs significantly from 90s hip-hop in ways that can be confirmed using computational musicology, but I’m also interested in possible similarities. My corpus therefore contains a merged group of tracks from both periods. In the analysis presented here I will refer to the popular hip-hop music of the 90s as ‘old-school’ and the popular rap music from around 2017 to 2023 as ‘modern’.
As mentioned, the natural comparison groups are the popular tracks from these two time spans. One of my hypotheses is that loudness has increased because of the way music and hip-hop beats are produced these days, but I am very unsure how other factors like energy, valence and danceability have changed between the two groups. I also assume that modern rap features more electronic and digital production elements than 90s hip-hop. I hope to identify these differences by looking at various aspects, such as timbre.
One strength of my corpus is that I used popularity charts and Spotify stream counts to check which tracks were the most popular in each time span, the 90s and around 2017 to 2023, and used these as a reference point. This limits the influence of subjectivity and my own taste, because I only included tracks that are popular and have a lot of streams. So although the two time spans differ a lot and hip-hop/rap has evolved drastically, my comparison groups have one thing in common: they both consist only of really popular tracks. This also raises some interesting research questions. On the one hand, do the popular tracks within each time span form a unified group, or is there a lot of fluctuation? On the other hand, can we identify common characteristics for the whole corpus?
The Spotify API track features provide a lot of information about how the tracks are classified and which musical characteristics and qualities they convey. These track features therefore offer a reliable foundation for starting this research. You can find the playlist of the corpus here.
This is an AI-generated image, made on Craiyon, of two very well-known artists: Post Malone on the left and Jay-Z on the right. It is a fun blend of two iconic figures from different eras, with Jay-Z representing the 90s hip-hop sound and Post Malone embodying the modern-day rap style.
To begin the visualisation of the selected corpus, let’s take a look at two variables called energy and valence. How do they differ between 90s hip-hop and modern-day rap, and is there a correlation between the two variables? Both variables range from 0.0 to 1.0. Valence refers to the emotional quality the music conveys: 1.0 is very positive, happy and/or uplifting, and 0.0 is angry, regretful or sad. Energy gives us an idea of the intensity and activity of a track; a really energetic track would feel rousing, for example. Valence fluctuates more in the modern rap music, whereas the old-school graph has more tracks with high valence, predominantly above 0.5. Overall, the most popular hip-hop/rap songs from both time spans generally have high energy values, most of them around 0.5 or higher. It is difficult to determine a clear correlation between energy and valence in these songs: in the old-school graph, the highest energy values are found at both the lowest and the highest values of valence. One possible explanation is that either low or high valence goes together with high energy. It’s important to mention that there are exceptions, though, such as “Check the Rhime” by A Tribe Called Quest, released in 1991 (the blue dot in the upper left corner of the “old-school” graph). One clear difference between the two graphs is that in the modern period the highest energy values are not tied to either high or low valence, whereas in the old-school period the highest energy occurs at the extreme values of valence, both high and low.
You can hover over the dots of the scatter plot to see the track names, the exact release date and the exact values of both variables. If you are familiar with 90s hip-hop or popular rap songs from recent years, you will probably recognize some of the track names!
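To illustrate why a single correlation number is hard to read for the energy and valence pattern described here, the following is a minimal sketch that computes a Pearson correlation by hand. The values are made up for illustration and are not taken from the corpus; the function name `pearson` is my own. It shows that a U-shaped pattern (high energy at both extremes of valence) can give a correlation of roughly zero even though the two variables are clearly related.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Illustrative values only, NOT measurements from the corpus.
valence = [0.1, 0.3, 0.5, 0.7, 0.9]
energy_u_shaped = [0.9, 0.7, 0.6, 0.7, 0.9]  # high energy at both valence extremes
energy_linear = [0.5, 0.6, 0.7, 0.8, 0.9]    # energy rises steadily with valence

r_u = pearson(valence, energy_u_shaped)      # close to 0 despite the clear pattern
r_linear = pearson(valence, energy_linear)   # close to 1

print(f"U-shaped: r = {r_u:.3f}, linear: r = {r_linear:.3f}")
```

A correlation near zero therefore does not have to mean that energy and valence are unrelated; it can also mean that the relationship is not a straight line, which is why looking at the scatter plot itself remains important.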
These graphs examine the relationship between speechiness and danceability in both time periods (you can highlight one of the two periods by double-clicking on it in the legend). Danceability measures how suitable a track is for dancing; it’s primarily based on rhythm, tempo and beat strength. It ranges from 0.0 to 1.0, with higher values indicating a greater likelihood of the track being danceable. Speechiness, on the other hand, measures the presence of spoken words in a track. It also ranges from 0.0 to 1.0, with higher values indicating that the track contains more ‘spoken word-like’ vocals. Without going into too much detail here, the two graphs contain exactly the same data; the only difference is the trend lines, which are calculated with two different methods called LOESS and GAM. One interesting and very extreme outlier here is “Yes Indeed” by Lil Baby featuring Drake, which has both the highest danceability (almost at the maximum of 1.0) and the highest speechiness of the whole corpus. Because a single extreme point like this can pull a trend line towards it, I’ve used two different methods for determining the trend lines. For the old-school period, both methods show a relatively stable relationship between speechiness and danceability, with an upward trend at the beginning that peaks between the speechiness values of 0.2 and 0.3. This suggests a clear trend in the old-school period, where an increase in speechiness is associated with an initial increase in danceability, followed by a decrease. The cluster of data points between 0.2 and 0.3 speechiness in the old-school period also aligns nicely with the peaks of both lines. Using the LOESS method, the modern trend line crosses the old-school trend line at around 0.23 speechiness. However, the modern LOESS trend line is clearly affected by the outlier “Yes Indeed”.
The GAM method, on the other hand, shows a horizontal line for the modern period, which would indicate that an increase in speechiness does not influence danceability in a meaningful way there, whilst the old-school period shows a non-linear relationship with both methods. All in all, more analysis would be needed to confirm these assumptions, but the results are remarkable: “Yes Indeed” has a very decisive impact on the modern period, yet it is also an important data point that we can’t ignore. Another relevant observation regarding both groups is that the modern popular rap music contains noticeably more tracks with low speechiness than the 90s popular hip-hop, and that these tracks tend to show more fluctuation in danceability. The graphs also confirm one observation made earlier: this whole corpus largely consists of tracks with very high danceability.
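The effect of one outlier on a local trend line can be sketched with a simplified LOESS-style smoother. This is an illustration under assumptions, not the implementation behind the graphs: the function `loess_at`, the `span` value and all data points are hypothetical, and a GAM is not covered because it would need a dedicated library. The sketch fits a weighted straight line around one speechiness value and compares the fitted danceability with and without a single extreme point.

```python
def loess_at(x0, xs, ys, span=0.75):
    """Fitted value at x0 from a local linear fit with tricube weights."""
    n = len(xs)
    k = max(2, int(round(span * n)))            # number of neighbours in the window
    dists = sorted(abs(x - x0) for x in xs)
    h = dists[k - 1] or 1e-12                   # bandwidth: distance to k-th neighbour
    # Tricube weights: close points count heavily, points at distance >= h count 0.
    w = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
    sw = sum(w)
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw
    my = sum(wi * y for wi, y in zip(w, ys)) / sw
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my + slope * (x0 - mx)

# Hypothetical values, NOT from the corpus: a flat group plus one extreme track.
speechiness = [0.05, 0.08, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35]
danceability = [0.80] * len(speechiness)
outlier = (0.55, 0.96)  # very high speechiness and danceability

fit_without = loess_at(0.35, speechiness, danceability)
fit_with = loess_at(0.35, speechiness + [outlier[0]], danceability + [outlier[1]])

print(f"fit at 0.35 without outlier: {fit_without:.3f}, with outlier: {fit_with:.3f}")
```

In this toy example the single extra point lifts the local fit at a speechiness of 0.35, even though every other track has the same danceability. Running a trend line with and without a suspicious track is a simple way to judge how much of the curve depends on it.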
Timbre is everything about a sound that is separate from the qualities pitch, duration and volume. It can be seen as tone colour or tone quality. For example, think of a violin and a piano both playing a musical tone like C4: the tone could have the same pitch, the same duration and the same volume, but it will differ in timbre. Timbre is a comprehensive concept that is sometimes difficult to put into words and can therefore be difficult to analyse. This graph contains the 12 timbre coefficients that Spotify uses to analyse an audio file, and these are pretty ambiguous to interpret; the only people who have assigned a real meaning to them are Spotify engineers, and they’ve kept the exact meaning of these values strictly internal. At a quick glance there seems to be a lot of similarity between the two groups. On the other hand, there are also clear differences in shape and range for certain coefficients, for example c02, c05, c07, c08 and c11, which suggests that the timbre of the two groups differs. The first coefficient is mainly based on the loudness variable; so although volume usually isn’t considered part of timbre, Spotify uses loudness as one of its timbre coefficients. Try to take a good look at c01 before going to the next page, or feel free to come back to this page after visiting the next. Although small, this graph visualises the loudness values of this corpus very accurately, and it is fun to see how you can recognize the shape of the boxplots I will present on the next page in these shapes!
One of my hypotheses was that modern rap music would have higher loudness than 90s hip-hop music. This boxplot shows that the modern rap music has a higher median, but that the 90s actually has the highest maximum loudness value of the whole corpus! The range of loudness for 90s hip-hop music (from -14.73 to -2.43 dB) is much wider than that of modern rap music (from -9.31 to -3.37 dB, when we leave out the outlier). This indicates that there is greater variability in loudness within 90s hip-hop music; the interquartile range (IQR) of both boxplots is a good visual representation of this. Overall, the boxplots suggest that modern rap music tends to be louder than 90s hip-hop music on average, because it has a higher median and a more concentrated distribution of loudness values, with a much smaller IQR. Nevertheless, it is worth noting that the third quartile of the 90s hip-hop is higher than that of the modern rap music and, as mentioned earlier, the loudest track of this corpus is from the 90s. However, the possibility that these tracks were remastered, or that other factors affected the loudness of the music, cannot be ruled out based on the boxplots alone, and this could be a decisive reason why the 90s boxplot has the highest maximum loudness value. This corpus only consists of the most popular tracks, so it is plausible that these tracks were remastered before they were uploaded to Spotify.
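The quantities a boxplot displays (median, quartiles, IQR and outliers) can be computed as in the short sketch below. It assumes the common 1.5 × IQR rule for marking outliers, which may differ from the rule used for the plot above, and the loudness values are hypothetical rather than taken from the corpus. The helper name `boxplot_summary` is my own.

```python
import statistics

def boxplot_summary(values):
    """Median, quartiles, IQR and 1.5*IQR outliers of a list of numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {"q1": q1, "median": median, "q3": q3, "iqr": iqr, "outliers": outliers}

# Hypothetical loudness values in dB, NOT from the corpus.
loudness = [-14.0, -8.0, -7.0, -6.5, -6.0, -5.5, -5.0, -4.5, -4.0]

summary = boxplot_summary(loudness)
print(summary)
```

Comparing the `median` and `iqr` entries of two groups gives the same information as comparing the centre line and the height of the two boxes by eye.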
For this graph we are looking at the mean tempo of all the tracks in the corpus; the standard deviation of the tempo is shown on the y-axis. A first observation is that the standard deviation is very low overall, so the tempo within a track barely varies. This is something we would expect from hip-hop and rap music in general, and therefore also from both these periods, as the beats and rhythm are usually very repetitive and steady. It seems likely that this also helps explain why this type of music generally has such high danceability values, as we saw earlier. Besides this, we can see a clear difference between the old-school hip-hop from the 90s and the modern rap music: apart from the 3 outliers on the right, the old-school hip-hop has a noticeably lower mean tempo than the modern rap music. One interesting observation, which we also saw in the valence and energy graph, is that the modern rap music shows a lot of fluctuation compared to the 90s period. The modern tracks form a steady cluster around 140 to 150 BPM but also cover many other BPM values, whereas the old-school tracks, apart from the 3 outliers with high BPM, form one steady cluster from around 80 to roughly 115 BPM.
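As a small illustration of the two values plotted per track, the sketch below computes a mean tempo and a standard deviation from a list of tempo estimates. It assumes that each track comes with several tempo estimates (for example one per section), which is my assumption about how the graph was built; the numbers and the name `tempo_summary` are hypothetical.

```python
import statistics

def tempo_summary(tempo_estimates):
    """Return (mean, sample standard deviation) of a track's tempo estimates in BPM."""
    return statistics.mean(tempo_estimates), statistics.stdev(tempo_estimates)

# Hypothetical tempo estimates in BPM, NOT from the corpus.
steady_track = [90.0, 90.0, 90.0, 90.0]     # tempo never changes
drifting_track = [138.0, 140.0, 142.0]      # tempo drifts slightly

steady_mean, steady_sd = tempo_summary(steady_track)
drifting_mean, drifting_sd = tempo_summary(drifting_track)

print(f"steady: {steady_mean:.1f} BPM (sd {steady_sd:.1f})")
print(f"drifting: {drifting_mean:.1f} BPM (sd {drifting_sd:.1f})")
```

A track with a perfectly steady beat ends up at the bottom of such a graph (standard deviation of zero), while a track whose tempo drifts sits higher.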
Here we are looking at a self-similarity matrix of the song “Shoota”, which is one of the centre points in the first graph, where we looked at valence and energy. In my opinion this song accurately represents a shift in the sound and style of hip-hop, reflecting the evolution of the genre and the changing tastes and preferences of its audience. The delivery style and flow of the rapping are very different from old-school hip-hop: the song takes a more melodic, sing-song approach compared with the more straightforward old-school style. “Shoota” also features a somewhat more complex and layered production style than the old-school hip-hop songs, and therefore I think it is an interesting song to look at and to give an in-depth analysis using multiple grams.
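For readers unfamiliar with the idea, a self-similarity matrix compares every moment of a track with every other moment of the same track. The sketch below is a minimal version under assumptions: it uses cosine distance between feature vectors (for example timbre or chroma vectors per segment), which may not be the distance used for the matrix of “Shoota”, and the three toy vectors are made up. The function names are my own.

```python
import math

def cosine_distance(a, b):
    """1 minus the cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def self_similarity(frames):
    """Matrix of pairwise distances between all frames of one track."""
    return [[cosine_distance(a, b) for b in frames] for a in frames]

# Toy feature vectors, NOT from the corpus: frames 0 and 2 sound alike, frame 1 differs.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]

matrix = self_similarity(frames)
for row in matrix:
    print([round(value, 2) for value in row])
```

Low values away from the diagonal mark moments that resemble each other, which is how repeated sections such as a returning hook show up as patterns in the matrix.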